Directory Description with Namaste Tags

Abstract 

Namaste (NAMe AS TExt) is a file naming convention to support
primitive directory-level metadata tags exposed directly via filenames.

As such, Namaste tags greet visitors who request a directory listing 
(e.g., Linux 'ls') with a glimpse of what the directory holds. An important
use is to declare a directory's "type", somewhat like a file's "magic 
number". A Namaste tag, D=tvalue, is usually a tag value preceded by a 
single-digit name, designed so that ordinary directory listings will tend to 
display tags, if present, in a block near the top. Tag names (digits) are 
currently reserved for directory type (0) and the four DC Kernel 
elements (1-4). Restrictions on filename characters and lengths may 
result in a tag value that is a lossy representation of the complete 
metadata value, assumed to be the content of the tag file.


1. What kind of directory is this? 

Identifying the type of a file is generally much easier than identifying the 
type of a directory (or folder). A file's type is often carried in-band as a 
unique multi-byte sequence occurring at or near the beginning of the 
file, sometimes called its "magic number". 

A directory, however, has no equivalent typing mechanism, despite its 
being a natural container for any digital object complex enough to span 
a file group or file hierarchy. Namaste tags address this with a kind of
in-band directory magic number that appears at or near the beginning of a
typical directory listing. Effectively, they establish a well-known location 
within a directory for software and human users to discover what kind of 
directory one is dealing with. 


2. Namaste tag basics 







A Namaste ("name as text") tag is a metadata element that represents 
an element name and value directly in the name of a file. The typical 
form of the filename is designed so that elements should appear as a 
group near the top of a typical directory listing (e.g., using Linux 'ls').
There they greet the visitor, serving as labels that are quickly noticeable 
to users and easy to find with software. Here's a sample directory listing, 
the first entry of which is a Namaste tag declaring this directory's "type" 
to be "Bagit Version 0.96". 


0=bagit_0.96    bag-info.txt    fetch.txt
bagit.txt       data/           manifest-md5.txt


This specification defines the form and meaning of Namaste tags but an 
application may otherwise determine how to use them. For example,
a tool that processes "widget 1.3" directories might require the presence 
of a "0=widget_1.3" file. A tag's form, as a filename, is 

D=tvalue 

where D is usually a single decimal digit representing the tag name and 
tvalue is a string representing the tag value. What's inside the tag file 
named "D=tvalue" is the full value string from which tvalue is derived. 
For example, on a Linux system:


% cat 0=widget_1.3
widget 1.3 


In general, tvalue is the result of a transformation that may remap 
filesystem-unsafe characters and shorten the full value string. Here's 
another example illustrating one approach to this transformation. 


% ls
0=dflat_1.8       admin/         splash.txt   v005/
1=Twain,_Mark     annotations/   v001/        v006/
2=Huckleberry..   data/          v002/        v007/
3=1898            enrichment/    v003/        v008/
4=12345678901..   manifest.txt   v004/
% cat ?=*
dflat 1.8
Twain, Mark
Huckleberry Finn
1898
12345678901123456

The purpose of Namaste tags is to help a human being get a glimpse of
what the containing directory is about. Should it be needed, a tag file's
content provides a complete element value without further parsing.
There is no other machine-readability requirement.
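
As a rough illustration of how software might find tags, the following
Python sketch lists a directory and splits each candidate filename at the
first '='. The function names find_basic_tags and has_type are
hypothetical, and limiting the scan to single-digit tag names on regular
files is a simplifying assumption, not something this specification
requires.

```python
import os
import re

# One decimal digit, '=', then a non-empty tvalue (assumed filename shape).
BASIC_TAG = re.compile(r"([0-9])=(.+)", re.DOTALL)


def find_basic_tags(dirpath):
    """Return sorted (name, tvalue, filename) triples for single-digit tags."""
    tags = []
    for filename in os.listdir(dirpath):
        match = BASIC_TAG.fullmatch(filename)
        # Only regular files are treated as tag files in this sketch.
        if match and os.path.isfile(os.path.join(dirpath, filename)):
            tags.append((match.group(1), match.group(2), filename))
    return sorted(tags)


def has_type(dirpath, tvalue):
    """True if the directory holds a type tag file named '0=' + tvalue."""
    return any(name == "0" and value == tvalue
               for name, value, _ in find_basic_tags(dirpath))
```

A tool for "widget 1.3" directories could then refuse to proceed unless
has_type(path, "widget_1.3") is true.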


3. Basic and extended tag names 

The basic Namaste tag name is a single digit. The tag name that
specifies the directory type is 0, and tag names 1-4 correspond to Dublin
Core (DC) [RFC5013] Kernel Metadata [Kernel] elements h1-h4. The
currently defined tag names are summarized below. 

0=type  - directory type string ("magic number")
1=who   - who created, published, or contributed to it
2=what  - what the expression was called (DC Title)
3=when  - when it was expressed (DC Date)
4=where - where to find the expression (DC Identifier)

These tag names were conceived with default sorting order in mind so 
that directory listings would tend to display tags, if present, in a block 
near the top. Note that sorting is typically locale-sensitive, sometimes 
with results that are not immediately obvious when a directory contains 
other filenames that begin with digits. 
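
The effect of other digit-initial filenames can be seen in a small Python
sketch. It sorts by code point, which only approximates a listing in the
"C" locale; other locales may order names differently. The name
"2010-report.txt" is a made-up example.

```python
names = ["data", "3=1898", "admin", "2=Huckleberry..",
         "2010-report.txt", "0=widget_1.3"]

# sorted() compares strings by code point: digits sort before letters,
# and '0' (hex 30) sorts before '=' (hex 3d).
listing = sorted(names)
for name in listing:
    print(name)
```

Here "2010-report.txt" lands between the 0= and 2= tags, so the tag block
is no longer contiguous.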

An extended Namaste tag name is an arbitrary multi-character string of 
letters, digits, and underscores ('_') that starts with a letter, underscore, 
or period ('.'). Extended tag names will tend not to have the same 
display and grouping features as single-digit tag names. 

This specification does not currently define any extended names. 
Applications that wish to define names that won't conflict with future 
defined Namaste tag names should begin theirs with "x_". 
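
One possible reading of these naming rules can be written as two regular
expressions. This is a sketch only: classify_tag_name is a hypothetical
helper, "letters" is assumed to mean ASCII letters, and "multi-character"
is taken to mean at least two characters.

```python
import re

# Basic name: exactly one decimal digit.
BASIC_NAME = re.compile(r"[0-9]")
# Extended name: starts with a letter, underscore, or period, followed by
# one or more letters, digits, or underscores (ASCII assumed).
EXTENDED_NAME = re.compile(r"[A-Za-z_.][A-Za-z0-9_]+")


def classify_tag_name(name):
    """Return 'basic', 'extended', or None for a candidate tag name."""
    if BASIC_NAME.fullmatch(name):
        return "basic"
    if EXTENDED_NAME.fullmatch(name):
        return "extended"
    return None
```

Under these assumptions, "0" is basic, "x_myapp" is extended, and "12" is
neither.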

Namaste tags have no formal relationship to filesystem-supported 
key/value metadata, such as XFS extended attributes. 


4. Transforming metadata values into tag values 

As mentioned, the tag value, tvalue, within the name "D=tvalue", is in
general the result of a transformation of the full value string found inside 
the tag file. That transformation, Tr, may remap filesystem-unsafe 
characters and shorten the full value string, fvalue, in creating tvalue. 

Tr(ContentOfFileNamed("D=tvalue")) = Tr(fvalue) = tvalue




The transformation process can be very flexible as it is entirely for the 
benefit of human users. An application that creates Namaste tags is thus 
free to transform values differently depending on the element and the 
audience. For example, it might deem element 4 ("where") to be too 
valuable ever to truncate, regardless of the consequences for display. As 
with any transformation, some fvalues may produce no change (i.e.,
tvalue is the same as fvalue); this was the case for the tag 3=1898 in the
previous example. 

Two common aspects of transformation are character re-mapping and 
string shortening. Re-mapping is necessary to avoid characters in tvalue 
that would be illegal in a filename, such as '/'. Re-mapping some legal 
characters may make it easier to manipulate files (e.g., changing spaces 
to underscores). If platform independence is desired in contemporary 
filesystems (Unix and Windows), the following characters found in fvalue 
should be avoided:


" * / : < > ? \ |


Shortening strings may be necessary for convenient display of a
multi-column directory listing. For long strings, shortening may also be
necessary because of maximum filename length restrictions (e.g., 255 
characters in Windows). Shortening may occur anywhere that it is most 
appropriate. A common technique is to substitute the least significant 
characters with an ellipsis (".." or "...") to indicate missing characters. 
Applications may choose to truncate at the right, left, or middle of a 
string, and vary truncation length depending on the element. 
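
Since the transformation is left to the application, the following Python
sketch shows just one possible Tr. The function name to_tvalue and its
parameters are hypothetical, and the specific choices (spaces become
underscores, unsafe characters become '-', right truncation to 13
characters with a ".." ellipsis) are assumptions picked for illustration.

```python
# Characters to avoid for Unix/Windows portability, as listed above.
UNSAFE = '"*/:<>?\\|'


def to_tvalue(fvalue, max_len=13, ellipsis="..", replacement="-"):
    """One possible Tr: remap characters, then shorten on the right."""
    text = fvalue.rstrip("\r\n")      # drop any terminal newline
    text = text.replace(" ", "_")     # spaces become underscores
    text = "".join(replacement if ch in UNSAFE else ch for ch in text)
    if len(text) > max_len:
        # Replace the tail with an ellipsis to signal missing characters.
        text = text[: max_len - len(ellipsis)] + ellipsis
    return text


for fvalue in ["Twain, Mark", "Huckleberry Finn", "1898"]:
    print(to_tvalue(fvalue))
```

An application that considers element 4 too valuable to truncate could
simply call to_tvalue with a much larger max_len for that element.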


5. Namaste directory types 

This specification defines an extendable register of directory types in 
Table 1. Within this register, N.M refers to major and minor specification
version numbers. 


Directory Type     Reference

0=bagit_N.M        [BAGIT]
0=can_N.M          [CAN]
0=dflat_N.M        [DFLAT]
0=pairtree_N.M     [PAIRTREE]
0=redd_N.M         [REDD]

Table 1: Namaste Directory Types


Additional types will be defined and may be submitted by sending email 
to the author. Submissions should conform to these guidelines to reduce 
the need for lossy transformation when creating the tvalue: proposed
strings (fvalues) should not exceed 16 characters in length and should 
contain only letters, digits, underscores, periods, and hyphens. 
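
A submitter might check a proposed string against these guidelines with a
sketch like the one below. The helper name is_acceptable_type is
hypothetical, and "letters" is assumed to mean ASCII letters.

```python
import re

# 1 to 16 characters drawn from letters, digits, '_', '.', and '-'.
TYPE_STRING = re.compile(r"[A-Za-z0-9_.-]{1,16}")


def is_acceptable_type(fvalue):
    """True if a proposed type string follows the submission guidelines."""
    return TYPE_STRING.fullmatch(fvalue) is not None
```

For instance, "pairtree_0.1" passes, while a string containing a space
does not.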


6. Tag file content 

Once a directory has been read, its Namaste tags, as filenames, are 
available without an extra disk read. While this provides efficient access, 
the filename metadata in tvalue is often lossy. To mitigate this situation, 
the tag file's content, fvalue, should be a lossless, newline-terminated, 
plain text representation, using UTF-8 [RFC3629], of tvalue. 

The terminal newline (LF hex 0a), or its equivalent (CR hex 0d or CRLF
hex 0d0a), is for convenient editing and display of a metadata value in
line-oriented systems such as Unix, and should be trimmed by 
applications that require a strict sense of the fvalue. 
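
The round trip between fvalue and tag file content might look like the
following Python sketch. The helpers write_tag and read_fvalue are
hypothetical; writing LF as the terminator and trimming exactly one
terminal newline on reading are assumptions about what an application
might want.

```python
import os


def write_tag(dirpath, name, tvalue, fvalue):
    """Create the tag file 'name=tvalue' holding fvalue plus a newline."""
    path = os.path.join(dirpath, f"{name}={tvalue}")
    # newline="" disables newline translation, so exactly LF is written.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(fvalue + "\n")
    return path


def read_fvalue(path):
    """Return the tag file content with one terminal newline trimmed."""
    with open(path, "r", encoding="utf-8", newline="") as handle:
        content = handle.read()
    # Check CRLF first so that both of its characters are removed together.
    for terminator in ("\r\n", "\n", "\r"):
        if content.endswith(terminator):
            return content[: -len(terminator)]
    return content
```

With these helpers, write_tag(d, "0", "widget_1.3", "widget 1.3") creates
the file "0=widget_1.3", and read_fvalue on that file returns the strict
fvalue "widget 1.3".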